Automatic speech emotion recognition using modulation spectral features

نویسندگان

Siqing Wu

Tiago H. Falk

Wai-Yip Chan

چکیده

In this study, modulation spectral features (MSFs) are proposed for the automatic recognition of human affective information from speech. The features are extracted from an auditory-inspired long-term spectro-temporal representation. Obtained using an auditory filterbank and a modulation filterbank for speech analysis, the representation captures both acoustic frequency and temporal modulation frequency components, thereby conveying information that is important for human speech perception but missing from conventional short-term spectral features. On an experiment assessing classification of discrete emotion categories, the MSFs show promising performance in comparison with features that are based on mel-frequency cepstral coefficients and perceptual linear prediction coefficients, two commonly used short-term spectral representations. The MSFs further render a substantial improvement in recognition performance when used to augment prosodic features, which have been extensively used for emotion recognition. Using both types of features, an overall recognition rate of 91.6% is obtained for classifying seven emotion categories. Moreover, in an experiment assessing recognition of continuous emotions, the proposed features in combination with prosodic features attain estimation performance comparable to human evaluation. 2010 Elsevier B.V. All rights reserved.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Abstract Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

متن کامل

Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...

متن کامل

Recognition of Human Emotion in Speech Using Modulation Spectral Features and Support Vector Machines

Automatic recognition of human emotion in speech aims at recognizing the underlying emotional state of a speaker from the speech signal. The area has received rapidly increasing research interest over the past few years. However, designing powerful spectral features for high-performance speech emotion recognition (SER) remains an open challenge. Most spectral features employed in current SER te...

متن کامل

Modulation Spectral Features for Predicting Vocal Emotion Recognition by Simulated Cochlear Implants

It has been reported that vocal emotion recognition is challenging for cochlear implant (CI) listeners due to the limited spectral cues with CI devices. As the mechanism of CI, modulation information is provided as a primarily cue. Previous studies have revealed that the modulation components of speech are important for speech intelligibility. However, it is unclear whether modulation informati...

متن کامل

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

Speech Communication

دوره 53 شماره

صفحات -

تاریخ انتشار 2011

Automatic speech emotion recognition using modulation spectral features

نویسندگان

چکیده

منابع مشابه

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Classification of emotional speech using spectral pattern features

Recognition of Human Emotion in Speech Using Modulation Spectral Features and Support Vector Machines

Modulation Spectral Features for Predicting Vocal Emotion Recognition by Simulated Cochlear Implants

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

عنوان ژورنال:

اشتراک گذاری